skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Search for: All records

Creators/Authors contains: "Huang, Yan"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. Abstract ObjectiveSNOMED CT provides a standardized terminology for clinical concepts, allowing cohort queries over heterogeneous clinical data including Electronic Health Records (EHRs). While it is intuitive that missing and inaccurate subtype (or is-a) relations in SNOMED CT reduce the recall and precision of cohort queries, the extent of these impacts has not been formally assessed. This study fills this gap by developing quantitative metrics to measure these impacts and performing statistical analysis on their significance. Material and MethodsWe used the Optum de-identified COVID-19 Electronic Health Record dataset. We defined micro-averaged and macro-averaged recall and precision metrics to assess the impact of missing and inaccurate is-a relations on cohort queries. Both practical and simulated analyses were performed. Practical analyses involved 407 missing and 48 inaccurate is-a relations confirmed by domain experts, with statistical testing using Wilcoxon signed-rank tests. Simulated analyses used two random sets of 400 is-a relations to simulate missing and inaccurate is-a relations. ResultsWilcoxon signed-rank tests from both practical and simulated analyses (P-values < .001) showed that missing is-a relations significantly reduced the micro- and macro-averaged recall, and inaccurate is-a relations significantly reduced the micro- and macro-averaged precision. DiscussionThe introduced impact metrics can assist SNOMED CT maintainers in prioritizing critical hierarchical defects for quality enhancement. These metrics are generally applicable for assessing the quality impact of a terminology’s subtype hierarchy on its cohort query applications. ConclusionOur results indicate a significant impact of missing and inaccurate is-a relations in SNOMED CT on the recall and precision of cohort queries. Our work highlights the importance of high-quality terminology hierarchy for cohort queries over EHR data and provides valuable insights for prioritizing quality improvements of SNOMED CT's hierarchy. 
    more » « less
  2. Dynamic protein structures are crucial for deciphering their diverse biological functions. Two-dimensional infrared (2DIR) spectroscopy stands as an ideal tool for tracing rapid conformational evolutions in proteins. However, linking spectral characteristics to dynamic structures poses a formidable challenge. Here, we present a pretrained machine learning model based on 2DIR spectra analysis. This model has learned signal features from approximately 204,300 spectra to establish a “spectrum-structure” correlation, thereby tracing the dynamic conformations of proteins. It excels in accurately predicting the dynamic content changes of various secondary structures and demonstrates universal transferability on real folding trajectories spanning timescales from microseconds to milliseconds. Beyond exceptional predictive performance, the model offers attention-based spectral explanations of dynamic conformational changes. Our 2DIR-based pretrained model is anticipated to provide unique insights into the dynamic structural information of proteins in their native environments. 
    more » « less
  3. Research on the impact of seawater intrusion on nitrogen (N) cycling in coastal estuarine ecosystems is crucial; however, there is still a lack of relevant research conducted underin-situfield conditions. The effects of elevated salinity on N cycling processes and microbiomes were examinedin situseawater intrusion experiments conducted from 2019 to 2021 in the Nakdong River Estuary (South Korea), where an estuarine dam regulates tidal hydrodynamics. After the opening of the Nakdong Estuary Dam (seawater intrusion event), the density difference between seawater and freshwater resulted in varying degrees of seawater trapping at topographically deep stations. Bottom-water oxygen conditions had been altered in normoxia, hypoxia, and weak hypoxia due to the different degrees of seawater trapping in 2019, 2020, and 2021, respectively. Denitrification mostly dominated the nitrate (NO3-) reduction process, except in 2020 after seawater intrusion. However, denitrification rates decreased because of reduced coupled nitrification after seawater intrusion due to the dissolved oxygen limitation in 2020. Dissimilatory nitrate reduction to ammonium (DNRA) rates immediately increased after seawater intrusion in 2020, replacing denitrification as the dominant pathway in the NO3-reduction process. The enhanced DNRA rate was mainly due to the abundant organic matter associated with seawater invasion and more reducing environment (maybe sulfide enhancement effects) under high seawater-trapping conditions. Denitrification increased in 2021 after seawater intrusion during weak hypoxia; however, DNRA did not change. Small seawater intrusion in 2019 caused no seawater trapping and overall normoxic condition, though a slight shift from denitrification to DNRA was observed. Metagenomic analysis revealed a decrease in overall denitrification-associated genes in response to seawater intrusion in 2019 and 2020, while DNRA-associated gene abundance increased. In 2021 after seawater intrusion, microbial gene abundance associated with denitrification increased, while that of DNRA did not change significantly. These changes in gene abundance align mostly with alterations in nitrogen transformation rates. In summary, ecological change effects in N cycling after the dam opening (N retention or release, that is, eutrophication deterioration or mitigation) depend on the degree of seawater intrusion and the underlying freshwater conditions, which constitute the extent of seawater-trapping. 
    more » « less
  4. Recent advances in Artificial Intelligence (AI) have brought society closer to the long-held dream of creating machines to help with both common and complex tasks and functions. From recommending movies to detecting disease in its earliest stages, AI has become an aspect of daily life many people accept without scrutiny. Despite its functionality and promise, AI has inherent security risks that users should understand and programmers must be trained to address. The ICE (integrity, confidentiality, and equity) cybersecurity labs developed by a team of cybersecurity researchers addresses these vulnerabilities to AI models through a series of hands-on, inquiry-based labs. Through experimenting with and manipulating data models, students can experience firsthand how adversarial samples and bias can degrade the integrity, confidentiality, and equity of deep learning neural networks, as well as implement security measures to mitigate these vulnerabilities. This article addresses the pedagogical approach underpinning the ICE labs, and discusses both sample activities and technological considerations for teachers who want to implement these labs with their students. 
    more » « less
  5. Recent advances in Artificial Intelligence (AI) have brought society closer to the long-held dream of creating machines to help with both common and complex tasks and functions. From recommending movies to detecting disease in its earliest stages, AI has become an aspect of daily life many people accept without scrutiny. Despite its functionality and promise, AI has inherent security risks that users should understand and programmers must be trained to address. The ICE (integrity, confidentiality, and equity) cybersecurity labs developed by a team of cybersecurity researchers addresses these vulnerabilities to AI models through a series of hands-on, inquiry-based labs. Through experimenting with and manipulating data models, students can experience firsthand how adversarial samples and bias can degrade the integrity, confidentiality, and equity of deep learning neural networks, as well as implement security measures to mitigate these vulnerabilities. This article addresses the pedagogical approach underpinning the ICE labs, and discusses both sample activities and technological considerations for teachers who want to implement these labs with their students. 
    more » « less
  6. Recent advances in Artificial Intelligence (AI) have brought society closer to the long-held dream of creating machines to help with both common and complex tasks and functions. From recommending movies to detecting disease in its earliest stages, AI has become an aspect of daily life many people accept without scrutiny. Despite its functionality and promise, AI has inherent security risks that users should understand and programmers must be trained to address. The ICE (integrity, confidentiality, and equity) cybersecurity labs developed by a team of cybersecurity researchers addresses these vulnerabilities to AI models through a series of hands-on, inquiry-based labs. Through experimenting with and manipulating data models, students can experience firsthand how adversarial samples and bias can degrade the integrity, confidentiality, and equity of deep learning neural networks, as well as implement security measures to mitigate these vulnerabilities. This article addresses the pedagogical approach underpinning the ICE labs, and discusses both sample activities and technological considerations for teachers who want to implement these labs with their students. 
    more » « less